A Survey on Script Segmentation for Bangla OCR
Abstract
Script segmentation is an essential early step in any Optical Character Recognition (OCR) system, and it matters most in off-line OCR of printed characters. Segmentation breaks a large image of a written document into a number of small pieces, which are then used for pattern matching to determine the expected sequence of characters. Script segmentation may likewise play a vital role in implementing Bangla OCR. Accurate segmentation, however, requires identifying the properties of Bangla script as well as its exceptions. This paper describes the most important and useful properties, advantages, and disadvantages of various Bangla scripts, especially printed scripts. It also offers some ideas about the prospects of Bangla OCR and its applications.
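As a rough illustration of how a page image can be broken into smaller pieces, the sketch below splits a binarized page into text lines using a horizontal projection profile. This is a common generic technique and is not claimed to be the method surveyed in the paper. The function name `segment_lines`, the `min_ink` threshold, and the toy `page` array are assumptions made for the example.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Split a binarized page into text lines (hypothetical helper).

    `image` is a list of pixel rows where 1 marks ink and 0 marks background.
    Returns (start_row, end_row) pairs, with end_row exclusive, one per
    run of consecutive rows that contain at least `min_ink` ink pixels.
    """
    # Horizontal projection profile: ink pixel count per row.
    profile = [sum(row) for row in image]

    segments: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            segments.append((start, y))    # a blank row closes the line
            start = None
    if start is not None:                  # line runs to the bottom edge
        segments.append((start, len(profile)))
    return segments


# Toy page (assumed data): two text lines separated by one blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
lines = segment_lines(page)
```

The same idea applied column-wise inside each line gives a first cut at word or character boundaries. For Bangla the headline (Matra) joins the characters of a word, so a practical system would typically need to locate and handle that stroke before splitting on columns.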
Similar resources
Bangla/English Script Identification Based on Analysis of Connected Component Profiles
Script identification is required for a multilingual OCR system. In this paper, we present a novel and efficient technique for Bangla/English script identification with applications to the destination address block of Bangladesh envelope images. The proposed approach is based upon the analysis of connected component profiles extracted from the destination address block images, however, it does ...
Handwritten Segmentation in Bangla Script: A Review of Offline Techniques
Offline handwritten segmentation in Bangla is an interesting area of research, as segmentation has long been one of the most critical stages of the optical character recognition process. Through this operation, an image of a sequence of characters, which may be connected in some cases, is decomposed into sub-images of individual alphabetic symbols. In this paper, segmentation of cursive handwritten s...
An improved offline handwritten character segmentation algorithm for Bangla script
Effective segmentation of offline word images of unconstrained handwritten Bangla script is a challenging problem in Optical Character Recognition (OCR) applications. The presence of a continuous horizontal line called ‘Matra’ is an important feature of this script. However, in unconstrained cursive handwriting, the Matra can be wavy or discontinuous, which makes the problem of segmentation diffic...
A Hybrid Approach to Classify Gurmukhi Script Characters
Researchers have worked extensively on OCR in the past few decades, as is visible from the various types of OCR available in the market. The majority of these available OCRs support foreign languages. In the Indian context, the majority of available OCRs are for Hindi and Bangla, but very few reports are available on Gurmukhi script, which is used to write Punjabi languag...
Segmentation of Touching Hand written Telugu Characters by using Drop Fall Algorithm
Recognition of Indian language scripts is a challenging problem. Work on the development of complete OCR systems for Indian language scripts is still in its infancy. Complete OCR systems have recently been developed for Devanagri and Bangla scripts. Research in the field of recognition of Telugu script faces major problems mainly related to the touching and overlapping of characters. Segmentation ...
Publication date: 2007